Querying the Data Warehouse with the SQL Procedure SELECT Statement

نویسنده

  • Kirk Paul Lafler
چکیده

To appreciate the wealth of information a data warehouse has to offer, you need to understand how to communicate with it. Information is typically accessed using an SQL query. The statement responsible for querying the many tables and returning the requested results in a data warehouse environment is the SELECT statement. It is an indispensable piece of the data retrieval process. This paper looks at the structure of the typical data warehouse, the types of information stored in it, and how users access and retrieve data from it. INTRODUCTION A data warehouse plays an important role in modern computer systems. It stores and organizes operational enterprise-wide data consisting of a mix of older unintegrated "legacy" application systems. Data warehouse data is accurate as of some moment in time. Its purpose is to support management and end-user analysis and decision making. Operational data passes into a data warehouse after being captured and filtered. Only data meeting the needs of management and decision support activities is passed. As it is being passed from the operational environment, data usually undergoes a transformation process prior to being stored in the data warehouse. Structurally, data warehouse data bares no resemblance to data in the operational environment. Consequently, data redundancy is normally not a problem. THE DATA WAREHOUSE STRUCTURE A data warehouse has a unique structure as illustrated in Figure 1. It consists of current and older operational data combined with varying levels of summarizations and detailed vantage points. In Inmon and Hackathorn's book "Using the Data Warehouse", they describe data warehouse data as "data spanning a spectrum of time and the data relationships between two or more tables may be many." A data warehouse is organized with data consisting of major business areas such as customer, product, vendor, and activity. Four types of data may be stored in a data warehouse: 1) older detail data, 2) current detail data, 3) lightly summarized data, and 4) highly summarized data. Older detail data represents data that is not very recent, maybe as old as ten years or longer. It is voluminous and most frequently stored on mass storage such as tape, although more expensive disk storage may be used. Its level of detail is consistent with current detail data (see later description), but due to its longer time horizon is typically migrated to a "less-expensive" alternate storage medium. Current detail data represents data of a recent nature and always has a shorter time horizon than older detail data. Although it can be voluminous, it is almost always stored on disk to permit faster access. Lightly summarized data represents data distilled from current detail data. It is summarized according to some unit of time and always resides on disk. Highly summarized data represents data distilled from lightly summarized data. It is always compact and easily accessible and resides on disk. A final component of the data warehouse is that of metadata. Metadata is best described as data about data. Simply put, it provides information about the structure of a data warehouse as well as the various algorithms used in data summarizations. It provides a descriptive view, or "blueprint", of how data is mapped from the operational level to the data warehouse. Advanced Tutorials

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Query Construct for Paraconsistent Databases

A data model has recently been developed for representing and manipulating two kinds of uncertain information in databases: incomplete and inconsistent. The model is based on 4-valued relations, called paraconsistent relations, and has already been employed for developing elegant methods for model computation of general deductive databases. Here, we present an SQL-like SELECT statement construc...

متن کامل

Processing of Crisp and Fuzzy Measures in the Fuzzy Data Warehouse for Global Natural Resources

Fuzzy Data Warehouse (FDW) is a data repository, which contains fuzzy data and allows fuzzy processing of the data. Incorporation of fuzziness into data warehouse systems gives the opportunity to process data at higher level of abstraction and improves the analysis of imprecise data. It also gives the possibility to express business indicators in natural language using terms, like: high, low, a...

متن کامل

BtSQL: nested bitemporal relational database query language

A nested bitemporal relational data model and its query language are implemented. The bitemporal atom (BTA) is the fundamental construct to represent temporal data and it contains 5 components: a value, the lower and upper bounds of valid time, and the lower and upper bounds of the recoding time. We consider 2 types of data structures for storing BTAs: 1) string representation and 2) abstract d...

متن کامل

Speeding Up Materialized-View Maintenance Using Cheap Filters at the Warehouse

We consider the problem of speeding up the incremental maintenance of materialized views defined by conjunctive queries (CQ) over external base relations, when querying these base relations is expensive. Our approach consists of detecting, without using the base relations, situations where a view either is not affected by a base update (VDU) or can be maintained using only the views in the ware...

متن کامل

Attribute Oriented Induction with simple select SQL statement

Searching learning or rules in relational database for data mining purposes with characteristic or classification/discriminant rule in attribute oriented induction technique can be quicker, easy, and simple with simple SQL statement. With just only one simple SQL statement, characteristic and classification rule can be created simultaneously. Collaboration SQL statement with any other applicati...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1998